-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Develop stream 2024-09-12 #462
base: develop
Are you sure you want to change the base?
Conversation
This prevents CCCL/thrust's build breakage because of v2.4.0 changes
@AlexVlx any opinions about the HIPSTDPAR changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, thanks for doing this! There are a few nits / things I'd like to see changed though, so please consider them. Cheers!
|
||
hipError_t status; | ||
|
||
status = rocprim::nth_element( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather we didn't do things like so, if possible, but rather simply kept the ::std:: overloads as a very thin forwarder to thrust i.e. this should just forward to thrust::nth_element(...)
, like the other algorithms did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason behind forwarding to rocPRIM's algorithms is that currently rocThrust does not have support for the new Parallel STL algorithms that have been added to rocPRIM, as neither NVIDIA's Thrust nor CUB have implemented those yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be somewhat less cognitively challenging to have something like a stub thrust::__nth_element(...)
call (we'd have to add the stubs, but you're already splitting into impl headers, no?) with a // TODO: this'll go away once something happens
rather than having inline prim for a few algorithms, whilst most others forward to the higher level thrust interface. You'll note that the originals tried to keep away from directly doing any hip*
stuff, and trying to do only C++ level handling before forwarding to Thrust. I am concerned with some problematic future, where some algorithms just doing a bunch of rocprim stuff inline because why not, some others keeping the forward to Thrust approach and, possibly, a 3rd category doing something else:) Perhaps I'm the odd person out with this preference / concern, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late answer. After discussing it a little bit internally, we are unsure whether adding stubs to rocThrust would be ideal: the interface of Thrust is based on that of the C++ Standard Library, but is not guaranteed to be exactly the same, so adding new API entries to rocThrust without knowing how Thrust would do it means that we may need to change this interface later (when Thrust adds these algorithms), which would be a breaking change.
If the only reason for adding stubs to rocThrust is removing the calls to rocPRIM from hipstdpar, then it's probably not worth the risk. Perhaps you have other worries about this that we are failing to see?
Also, about this:
I am concerned with some problematic future, where some algorithms just doing a bunch of rocprim stuff inline because why not, some others keeping the forward to Thrust approach and, possibly, a 3rd category doing something else:)
I don't think it should be a concern, as the missing algorithms are exclusively being added to rocPRIM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I don't really follow what the risk is. We're not subcontractors for the Thrust development team, so having additional interfaces wouldn't be a problem, if we were to go in that direction. Furthermore, perhaps the point is lost, but what I am suggesting is to do something along the lines of thrust::__nth_element_placeholder_stub_do_not_use_danger_will_robinson
, where the latter holds the current rocprim calls. The definition for this can and should live in the implementation header. It's primarily a matter of symmetry and code consistency, no other algorithm calls Prim inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, so there would still be calls to rocPRIM from the hipstdpar implementation headers, only that the algorithms themselves would appear as if they call to rocThrust (but they call the stubs), right? If that's the case I see no problem adding this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, that's it; then, when / if upstream adds these, we can just go and delete the double underscore prefix and the impl. This prevents folks (future maintainers) from having to ask themselves "should I just use PRIM here, or should I call Thrust", and at a glance makes it obvious we always call Thrust from the overloads. Thank you for taking the time to go through this, it's highly appreciated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the changes for this! Please check if it fits your idea for the use of stubs.
|
||
hipError_t status; | ||
|
||
status = rocprim::partial_sort( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem.
|
||
hipError_t status; | ||
|
||
status = rocprim::partial_sort_copy(nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ibidem.
* \brief <tt>Batch operations</tt> implementation detail header for HIPSTDPAR. | ||
*/ | ||
|
||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing the split is fine, since it aligns with Thrust convention, I guess; however, IMHO it'd be better if the master header (this), which is the object of implicit inclusion / is part of the public interface, would do nothing but include the split headers, as opposed to having any actual functionality. We could also probably guard the interpose_allocations.hpp
one with its respective macros. Finally, we probably want to error on the impl headers if they're directly included without __HIPSTDPAR__
being defined, as that would indicate user error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some questions about these suggestions:
IMHO it'd be better if the master header (this), which is the object of implicit inclusion / is part of the public interface, would do nothing but include the split headers, as opposed to having any actual functionality
Do you mean that we should further split the master headers (the ones named after the C++ Standard Library categories for algorithms)?
We could also probably guard the interpose_allocations.hpp one with its respective macros.
What macros do you refer to? IINM the interpose_allocations.hpp
is already guarded by the __HIPSTDPAR__
and
__HIPSTDPAR_INTERPOSE_ALLOC__
macros (as it was before splitting hipstdpar_lib.hpp
).
Finally, we probably want to error on the impl headers if they're directly included without HIPSTDPAR being defined, as that would indicate user error.
And for this, I don't fully understand why would we need to error here. Originally there was no error handling in case that macro was not defined in the hipstdpar_lib.hpp
, so I'm not sure why it would be done for the impl headers. Also, as I had understood it, the forwarding header was not intended to be included directly by users, but implicitly included by the compiler when specifying the --hipstdpar
option (which also triggers the definition of the __HIPSTDPAR__
macro). So if this is true (please correct me if not), then it wouldn't make sense to check for the definition of this macro IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean that we should further split the master headers (the ones named after the C++ Standard Library categories for algorithms)?
Not quite, what I meant is that the master header would pretty much be a list of #include
s for the impl headers, with no declarations / definitions or, really, any substantive code living there. In essence, the most we'd do here is check for environmental macros / possibly declare some future config macros, if necessary. Does that make sense?
What macros do you refer to? IINM the
interpose_allocations.hpp
is already guarded by the__HIPSTDPAR__
and__HIPSTDPAR_INTERPOSE_ALLOC__
macros (as it was before splittinghipstdpar_lib.hpp
).
What I was suggesting as a possible alternative is to check for __HIPSTDPAR_INTERPOSE_ALLOC__
in the master hipstdpar_lib.hpp
header and do conditional include of the impl header in there, rather than guarding within it. This ties both to the point above re: what to do with the master header and to the one below about considering error-ing out.
And for this, I don't fully understand why would we need to error here. Originally there was no error handling in case that macro was not defined in the
hipstdpar_lib.hpp
, so I'm not sure why it would be done for the impl headers. Also, as I had understood it, the forwarding header was not intended to be included directly by users, but implicitly included by the compiler when specifying the--hipstdpar
option (which also triggers the definition of the__HIPSTDPAR__
macro). So if this is true (please correct me if not), then it wouldn't make sense to check for the definition of this macro IMHO.
Whilst the vote of confidence is highly appreciated, I'd not take the original header to necessarily be complete / the best way to do things:) To wit, whilst the forwarding header was / is not meant to be included by users, that does not mean users won't do it (accidentally / intentionally). Error-ing out if the macro is not defined i.e. compilation for hipstdpar
was not desired would guard against accidents, and will at least force intentional abusers to manually define the macro, which'd help with eventual debugging. I suggested it because I think it might be a time saver / inform users about things potentially going awry. Ultimately, it's up to you if you want to go in this direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite, what I meant is that the master header would pretty much be a list of #includes for the impl headers, with no declarations / definitions or, really, any substantive code living there. In essence, the most we'd do here is check for environmental macros / possibly declare some future config macros, if necessary. Does that make sense?
Oh alright, I was confused because the comment was made on impl/batch.hpp
. But then, the master header hipstdpar_lib.hpp
is already just a macro definition check (__HIPSTDPAR__
) and a bunch of includes (for the impl/
headers)
What I was suggesting as a possible alternative is to check for HIPSTDPAR_INTERPOSE_ALLOC in the master hipstdpar_lib.hpp header and do conditional include of the impl header in there, rather than guarding within it. This ties both to the point above re: what to do with the master header and to the one below about considering error-ing out.
Whilst the vote of confidence is highly appreciated, I'd not take the original header to necessarily be complete / the best way to do things:) To wit, whilst the forwarding header was / is not meant to be included by users, that does not mean users won't do it (accidentally / intentionally). Error-ing out if the macro is not defined i.e. compilation for hipstdpar was not desired would guard against accidents, and will at least force intentional abusers to manually define the macro, which'd help with eventual debugging. I suggested it because I think it might be a time saver / inform users about things potentially going awry. Ultimately, it's up to you if you want to go in this direction.
Regarding those two points, the final logic should be then:
- Check in
hipstdpar_lib.hpp
if__HIPSTDPAR__
is defined and include theimpl
headers if so, exceptinterpose_allocations.hpp
which only will be included if__HIPSTDPAR_INTERPOSE_ALLOC__
is also defined. - Inside each implementation header, check if
__HIPSTDPAR__
is defined and err if it's not. - Additionally, in
interpose_allocations.hpp
, check if__HIPSTDPAR_INTERPOSE_ALLOC__
is defined and err if not.
If that sounds good I'll add a commit with the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds great, thanks and apologies for adding work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added the err logic discussed before
@AlexVlx Hi! I answered your comments about the hipstdpar modifications, as I was the one responsible for those. Thanks for reviewing this :) |
@AlexVlx just to check, are we good with the latest changes done to the hipstdpar headers? Just to know if more changes are needed or we can move on with this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the hipstdpar
piece, many thanks for doing this!
Apologies for the delay on getting the PR approved. Currently it looks like with the newer mainline images/compiler, there is a build error in rocThrust, but it's a compiler error not a rocThrust error. I will submit a bug report to the compiler team. I am able to build this a with 6.3 compiler and the tests pass in my limited testing, so I will merge this into develop for now. |
Actually, wait. I'm trying with a different compiler, this time from a 6.2.1 RC2 build, and im getting build errors that seem legitimate: |
I had a look at it. We do use 6.2.1 in our CI in which it works. Was able to reproduce this on a machine. This has to do with TBB that is used in the hipstdpar. Did find somewhere that it has to use libstdc++11 or newer for oneTBB to work. Was not able to find a fix. Will discuss with @Beanavil on Monday. |
@Beanavil I'm not sure if your latest commit to disable TBB tests changes anything, I still seeing the same error while trying to build rocThrust unit tests. What does TBB_TEST control? I do not see its use anywhere. I will upload a docker image where I still see build errors. |
Hi @stanleytsang-amd, the latest two commits do two things:
I was also able yesterday to access the pipeline output and the docker images to reproduce the errors, I give more details about what I could debug in my answer to your email. |
--benchmark_format
from not working in benchmarks.The pipeline fails, because the version of rocPRIM used is not including this commit yet.
Closes #478.